Report on the NTCIR-12 MedNLPDoc Task Results

Authors

  • Dina Vishnyakova
  • Christophe Gaudet-Blavigniac
  • Selen Bozkurt
  • David-Zacharie Issom
  • Renat Vishnyakov
  • Christian Lovis
Abstract

Reusing clinical data for research is becoming one of the important tasks in medical informatics. Automatically assigning medical codes to pre-identified concepts is turning into a Sisyphean task. For the MedNLP task in NTCIR-12, we propose a new approach that automatically enriches the dictionary using online data. We have developed a text-mining system that processes medical text written in Japanese and assigns ICD-10 codes with English descriptors to the identified concepts. The system has three main components: 1) a dictionary based on the English version of ICD-10, 2) Wikipedia-based synonyms, and 3) statistical translation tools such as the Yandex and Google Translate APIs. This report describes the system and the results achieved on the MedNLPDoc test data. Additionally, we report the frequency of ICD code assignment at the University Hospitals of Geneva.

General Terms: Algorithms, Standardization, Languages, Theory, Legal Aspects
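As a rough illustration of how the three components named in the abstract might fit together, the following minimal sketch chains a translation step, a synonym lookup, and an ICD-10 dictionary lookup. It is not the authors' implementation: the tables `ICD10_EN`, `SYNONYMS`, and `JA_TO_EN` and the functions `translate` and `assign_code` are hypothetical, and the translation step is a local lookup table standing in for a call to a machine-translation service.

```python
# Toy stand-ins only; every name and table below is hypothetical.

# English ICD-10 descriptors mapped to codes (tiny excerpt).
ICD10_EN = {
    "essential (primary) hypertension": "I10",
    "asthma": "J45",
}

# Synonyms of the kind that might be collected from Wikipedia (assumed data).
SYNONYMS = {
    "high blood pressure": "essential (primary) hypertension",
    "hypertension": "essential (primary) hypertension",
    "bronchial asthma": "asthma",
}

# Placeholder for machine translation; a real system would query an MT API.
JA_TO_EN = {
    "高血圧": "high blood pressure",
    "気管支喘息": "bronchial asthma",
}


def translate(term_ja):
    """Return an English rendering of a Japanese term, or None if unknown."""
    return JA_TO_EN.get(term_ja)


def assign_code(term_ja):
    """Translate a term, normalise it via synonyms, and look up its ICD-10 code.

    Returns a (code, descriptor) pair, or None when no code can be assigned.
    """
    term_en = translate(term_ja)
    if term_en is None:
        return None
    key = term_en.lower()
    key = SYNONYMS.get(key, key)  # map a synonym to its ICD-10 descriptor
    code = ICD10_EN.get(key)
    return (code, key) if code else None
```

In this sketch the synonym table does the work of dictionary enrichment: a translated term that does not match an ICD-10 descriptor literally can still reach a code through a synonym.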

Similar resources

NARS: NTCIR-12 MedNLPDoc Baseline

NTCIR-12 MedNLPDoc is a shared task on ICD coding, a multi-label task of assigning codes to a patient's medical record. This paper describes the baseline system for the task. The system is based on simple word matching against a disease-name dictionary and uses no training data. This report presents the results of the baseline system and discusses its basic feasibility.
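A dictionary-based word match of the kind this abstract describes could look roughly like the sketch below. The dictionary contents and the function name `code_record` are assumptions made for illustration, not taken from the baseline system.

```python
# Hypothetical disease-name dictionary: Japanese surface form -> ICD-10 code.
DISEASE_DICT = {
    "高血圧": "I10",  # hypertension
    "喘息": "J45",    # asthma
}


def code_record(text, dictionary=DISEASE_DICT):
    """Return the sorted list of codes whose disease name occurs in the text.

    Matching is plain substring search, so no training data is involved and
    one record can receive several codes (multi-label output).
    """
    return sorted({code for name, code in dictionary.items() if name in text})
```

Collecting the codes in a set keeps each code at most once per record, even when a disease name appears several times.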

Similarity Matrix Model for the NTCIR-12 MedNLPDoc Task

We participated in the NTCIR-12 MedNLPDoc phenotyping task. In this paper, we describe our approach for this task. The core part of our model is a similarity matrix model in which each element has a local similarity value between n-grams from a disease name and a medical record. We conduct an experiment to evaluate the effectiveness of our method. We report the results of our preliminary experi...
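The abstract does not say how the local similarity values are computed, so the sketch below assumes the simplest choice: character n-grams with an exact-match indicator (1.0 if two n-grams are identical, otherwise 0.0). The functions `char_ngrams`, `similarity_matrix`, and `match_score`, and the row-maximum aggregation, are all hypothetical and are not the paper's model.

```python
def char_ngrams(s, n=2):
    """Return the list of character n-grams of s, in order."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def similarity_matrix(name, record, n=2):
    """Rows are n-grams of the disease name, columns are n-grams of the record.

    Each element is an assumed local similarity: exact match gives 1.0, else 0.0.
    """
    rows = char_ngrams(name, n)
    cols = char_ngrams(record, n)
    return [[1.0 if r == c else 0.0 for c in cols] for r in rows]


def match_score(name, record, n=2):
    """Average, over the name's n-grams, of the best match found in the record."""
    m = similarity_matrix(name, record, n)
    if not m or not m[0]:
        return 0.0  # name or record is too short to yield any n-gram
    return sum(max(row) for row in m) / len(m)
```

A graded local similarity could replace the exact-match indicator without changing the matrix structure.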

Overview of the NTCIR-12 MedNLPDoc Task

As electronic medical records (EMR) have recently replaced physical documents, information processing has grown in importance in the medical field. We organized the MedNLP task series at NTCIR-10 and 11. These workshops were the first shared tasks to evaluate technologies that retrieve important information from medical reports written in Japanese...

MedNLPDoc: Japanese Shared Task for Clinical NLP

As electronic medical records (EMR) have recently replaced physical documents, information processing has grown in importance in the medical field. We organized the MedNLP task series at NTCIR-10 and 11. These workshops were the first shared tasks to evaluate technologies that retrieve important information from medical reports written in Japanese...

Team-Nikon at NTCIR-12 MedNLPDoc Task

The phenotyping task of the NTCIR-12 MedNLPDoc Task [1] is a multi-label task on Japanese medical records. Team-Nikon participated in this task and proposed a new method that assigns ICD codes using Information Retrieval (IR) and reduces coding errors using machine learning. When evaluated on the development set, our system achieved F-scores of 29.2% ...

Publication date: 2016